Robust and Consistent Sampling

نویسندگان
چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Consistent Weighted Sampling

We describe an efficient procedure for sampling representatives from a weighted set such that for any weightings S and T , the probability that the two choose the same sample is equal to the Jaccard similarity between them: Pr[sample(S) = sample(T )] = ∑ x min(S(x), T (x)) ∑ x max(S(x), T (x)) where sample(S) is a pair (x, y) with 0 < y ≤ S(x). The sampling process takes expected computation li...

متن کامل

Consistent Subset Sampling

Consistent sampling is a technique for specifying, in small space, a subset S of a potentially large universe U such that the elements in S satisfy a suitably chosen sampling condition. Given a subset I ⊆ U it should be possible to quickly compute I ∩ S, i.e., the elements in I satisfying the sampling condition. Consistent sampling has important applications in similarity estimation, and estima...

متن کامل

Consistent Robust Regression

We present the first efficient and provably consistent estimator for the robust regression problem. The area of robust learning and optimization has generated a significant amount of interest in the learning and statistics communities in recent years owing to its applicability in scenarios with corrupted data, as well as in handling model mis-specifications. In particular, special interest has ...

متن کامل

Improved Consistent Weighted Sampling Revisited

Min-Hash is a popular technique for efficiently estimating the Jaccard similarity of binary sets. Consistent Weighted Sampling (CWS) generalizes the Min-Hash scheme to sketch weighted sets and has drawn increasing interest from the community. Due to its constant-time complexity independent of the values of the weights, Improved CWS (ICWS) is considered as the state-of-the-art CWS algorithm. In ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: IEEE Signal Processing Letters

سال: 2009

ISSN: 1070-9908,1558-2361

DOI: 10.1109/lsp.2009.2023481